The Efficiency of Histogram-like Techniques for Database Query Optimization

Authors

  • B. John Oommen
  • Luis Rueda
Abstract

One of the most difficult tasks in modern database management systems is information retrieval. This task involves a user query, written in a high-level language such as the Structured Query Language (SQL), and a set of internal operations that are transparent to the user. These internal operations are carried out by complex modules that decompose, optimize and execute the query. We consider the problem of query optimization, in which the system chooses the most economical of many candidate query evaluation plans (QEPs). Since the number of QEPs increases exponentially with the number of relations involved in the query, query optimization is a very complex problem. Many estimation techniques have been developed to approximate the cost of a QEP, and histogram-based techniques are the most widely used in this context. In this paper, we discuss the efficiency of some of these methods: Equi-width, Equi-depth, the Rectangular Attribute Cardinality Map (R-ACM) and the Trapezoidal Attribute Cardinality Map (T-ACM). These methods are used to estimate the costs of the different QEPs and thereby to determine the optimal one. It has been shown that the estimation errors of R-ACM and T-ACM are significantly smaller than the corresponding errors of Equi-width and Equi-depth. This fact has been formally demonstrated using reasonable statistical distributions for the cost of a QEP, namely the doubly exponential distribution and the normal distribution. For the empirical analysis, we have developed a formal, rigorous prototype model used to analyze these methods on random databases. Our empirical results demonstrate that R-ACM chooses a superior QEP more than twice as often as Equi-width and Equi-depth. Similar results have been obtained for T-ACM when compared to the traditional methods.
Indeed, in the most general scenario, we analytically prove that, under certain models, the better the accuracy of an estimation technique, the greater the probability of choosing the most efficient QEP.
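
As a rough illustration of the two traditional techniques named in the abstract, the following Python sketch builds an Equi-width and an Equi-depth histogram over a list of attribute values and estimates the result size of a range predicate. The function names (`equi_width`, `equi_depth`, `estimate_range`), the sample data and the assumption that values are spread uniformly within each bucket are illustrative choices, not taken from the paper; R-ACM and T-ACM are not shown. The sketch also assumes there are at least as many values as buckets.

```python
def equi_width(values, buckets):
    """Equi-width: every bucket spans the same value range; counts vary."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / buckets or 1.0  # fall back to 1.0 if all values are equal
    counts = [0] * buckets
    for v in values:
        # Clamp so that the maximum value lands in the last bucket.
        i = min(int((v - lo) / width), buckets - 1)
        counts[i] += 1
    edges = [lo + i * width for i in range(buckets + 1)]
    return edges, counts


def equi_depth(values, buckets):
    """Equi-depth: boundaries are chosen so bucket counts are (nearly) equal."""
    s = sorted(values)
    n = len(s)
    cuts = [round(i * n / buckets) for i in range(buckets + 1)]
    # Left edge of each bucket is its first value; the final edge is the maximum.
    edges = [s[c] for c in cuts[:-1]] + [s[-1]]
    counts = [cuts[i + 1] - cuts[i] for i in range(buckets)]
    return edges, counts


def estimate_range(edges, counts, a, b):
    """Estimate how many values fall in [a, b], assuming a uniform
    spread of values inside each bucket (illustrative assumption)."""
    total = 0.0
    for left, right, count in zip(edges, edges[1:], counts):
        if right == left:
            # Zero-width bucket (many duplicates): count it only if it lies in range.
            if a <= left <= b:
                total += count
            continue
        overlap = min(b, right) - max(a, left)
        if overlap > 0:
            total += count * overlap / (right - left)
    return total


if __name__ == "__main__":
    # Hypothetical skewed attribute: 90 copies of 1, then 2..11 once each.
    data = [1] * 90 + list(range(2, 12))
    true_count = sum(1 for v in data if 1 <= v <= 2)
    for name, build in (("equi-width", equi_width), ("equi-depth", equi_depth)):
        edges, counts = build(data, 5)
        est = estimate_range(edges, counts, 1, 2)
        print(name, "estimate:", est, "true:", true_count)
```

An optimizer would feed such size estimates into its cost model for each candidate QEP; the gap between the estimate and the true count is the kind of error the abstract compares across techniques.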

Similar resources

Relational Databases Query Optimization using Hybrid Evolutionary Algorithm

Optimizing database queries is a hard research problem. Exhaustive search techniques such as dynamic programming are suitable for queries with a few relations, but as the number of relations in a query increases, they require excessive memory and processing, so random and evolutionary methods must be used instead. The use of evolutionary methods, beca...

Rectangular Attribute Cardinality Map: A New Histogram-like Technique for Query Optimization

Current database systems utilize histograms to approximate frequency distributions of attribute values of relations. These are used to efficiently estimate query result sizes and access plan costs. Even though they have been in use for nearly two decades, there have been no significant mathematical techniques (other than those used in statistics for traditional histogram approximations) to study...

Using histograms to estimate answer sizes for XML queries

Estimating the sizes of query results, and intermediate results, is crucial to many aspects of query processing. In particular, it is necessary for effective query optimization. Even at the user level, predictions of the total result size can be valuable in “next-step” decisions, such as query refinement. This paper proposes a technique to obtain query result size estimates effectively in an XM...

Estimating Answer Sizes for XML Queries

Estimating the sizes of query results, and intermediate results, is crucial to many aspects of query processing. In particular, it is necessary for effective query optimization. Even at the user level, predictions of the total result size can be valuable in “next-step” decisions, such as query refinement. This paper proposes a technique to obtain query result size estimates effectively in an XM...

An Empirical Comparison of Histogram-Like Techniques for Query Optimization

We consider the problem of Query Optimization, which consists of a database system choosing, among many different Query Evaluation Plans (QEPs), the most economical one for a given query. Since the number of QEPs increases exponentially with the number of relations involved in the query, query optimization is a very complex problem. Many estimation techniques have been developed in order to approxim...

Journal:
  • Comput. J.

Volume 45

Publication date: 2002